Abstract

The  pace  at  which  fuel  cell  systems  are  widely  adopted  by  the  marketplace  will  be  determined  primarily  by  two  factors:  (1)  the  rate  at  which 
system  cost  decreases  and  (2)  the  rate  at  which  system  reliability  increases.  This  paper  describes  the  field  reliability  and  its  improvement 
through  a  combination  of  software  and  hardware  changes  of  Plug  Power’s  GenSys™  fleet  of  5  kWe  (plus  up  to  9  kW  of  thermal  energy)  proton 
exchange  membrane  (PEM)  fuel  cell  systems.  Plug  Power  has  shipped  more  than  300  of  these  systems  to  more  than  50  customer  locations  in 
more  than  10  countries.  This  fleet  is  of  sufficient  size,  and  has  been  operating  for  a  sufficient  length  of  time,  to  develop  statistically  significant 
observations  of  system  reliability.  Nondimensionalized  probability  plots  of  PEM  stack  lifetime  in  field  units  are  presented,  and  a  series  of 
system-level  changes  are  described  that  have  increased  PEM  stack  life  by  about  a  factor  of  4.  Nondimensionalized,  system-level  reliability 
statistics  are  also  presented  for  the  installed  fleet.  Pareto  charts  describing  the  top  causes  for  system  failures  in  the  field  are  shown,  and  the 
general  methodologies  for  improving  system-level  reliability  are  discussed. 
1. Introduction

For  several  years,  Plug  Power  Inc.  has  been  developing, 
manufacturing,  and  selling  proton  exchange  membrane 
(PEM)  fuel  cell  systems  intended  for  stationary  power 
generation  in  residential  and  commercial  applications.  The 
GenSys™  family  of  products  is  one  of  Plug  Power’s  current 
offerings  in  this  market.  GenSys™  fuel  cell  systems  can 
be  fuelled  by  either  natural  gas  or  liquefied  petroleum  gas 
(LPG),  and  can  provide  both  heat  and  electric  power  to  the 
end  user.  Depending  upon  the  specific  model  and  installation 
configuration,  GenSys™  systems  are  capable  of  producing 
up to 5 kWe of ac electric power (at either 50 or
60 Hz) and up to 9 kW of thermal energy. Plug Power has
shipped  more  than  300  5  kWe  fuel  cell  systems  to  more  than 
50  customer  locations  around  the  world. 
Barbir [1] discusses the overall design of fuel cell systems
for stationary power generation applications. A picture
of the GenSys™ fuel cell system is shown in Fig. 1. This
fuel cell system contains five major modules, numbered 1 to 5
in this figure: (1) a fuel processing module, which converts a
hydrocarbon fuel into a high H2, low CO content reformate
stream; (2) a power generation module, which electrochemically
converts hydrogen and oxygen into water inside a PEM
fuel cell stack, producing both electric power and heat; (3) a
power electronics module, which converts the dc power produced
by the stack into ac power; (4) an electrical energy
storage module, which ensures continuity of electric power
during transients; and (5) a thermal management module,
which transfers usable heat to the customer.

Inadequate reliability is one of the primary factors that impede
the large-scale commercialization of proton exchange
membrane fuel cell systems. The reliability of the entire fuel
cell system depends upon the reliability of the fuel cell stack
and the reliability of all the other components within the system.
Every component within a fuel cell stack may affect the
reliability of the stack. A PEM stack is normally composed of
tens to hundreds of unit cells, and these unit cells are stacked
in series to generate the power and voltage required by the
end user. A failure within a single cell may require the entire
fuel cell system to be shut down. In addition, failures of
upstream components within the fuel cell system can lead to
premature stack damage and failure.

Fig. 1. A GenSys™ fuel cell system. Numbers refer to modules which are
described in Section 1.

A recent review of membrane electrode assembly (MEA)
and short stack reliability is available [2], and a general
discussion on reliability of fuel cell stacks can be found in Fowler
et al. [3]. However, there is much less information in the
literature on the reliability of PEM fuel cell systems. This paper
is intended to help fill this gap.

Several major design iterations have been deployed in order
to improve fuel cell system reliability and/or performance
since the first GenSys™ units (referred to here as the “B1”
units) were placed in the field in August 2001. Subsequent
major product revisions are referred to as the “B2”-“B6”
units in this paper, with the numbering scheme reflecting
the chronological order in which revisions were released.
Table 1 summarizes the approximate date each GenSys™
version entered production and the number of units manufactured.
The most recent version, B6, is still in production (as
of December 2004) and continues to be deployed to customer
sites.

The installed fleet of GenSys™ units is of sufficient size,
and has been operating for a sufficient length of time, to
enable us to develop statistically significant observations of
system reliability. These observations help identify the root
causes for system failures in the field, and can be used to
prioritize future technology development needs. The rate at
which system reliability is improving is an important metric
that can be used for program planning purposes. The remainder
of this paper will focus on these topics.

Table 1
Summary of GenSys™ build version dates and production volumes

GenSys™ build version   Production launch date   Number of units produced
B1                      August 2001              33
B2                      August 2001              33
B3                      October 2001             40
B4                      March 2002               54
B5                      September 2002           45
B6                      January 2003             108

2.  Definitions 

Any discussion of reliability requires a common understanding
of terms and nomenclature. In this section, we define
some of the terms that will be used in the subsequent discussion
of system and component reliability. A failure is defined
as an end-user detectable and verifiable loss of product
functionality, resulting in an unscheduled repair and/or replacement
to restore the lost functionality. For example, a failure is
regarded as occurring when the stack voltage becomes lower
than a predetermined value at a certain power output, even if
the fuel cell system is still operable. Product reliability is the
conditional probability, at a given confidence level, that the
system will perform its intended function(s) without failure
for a specified time period when operated under prescribed
usage and environmental conditions. Development time is the
total accumulated time between the launch of the first product
version (or B1) and any subsequent product revision (e.g.,
the B4 version). Cell ratio is defined as the ratio of the lowest
cell voltage to the mean cell voltage within a stack.
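
The cell ratio definition can be illustrated with a short sketch. The cell voltages below are hypothetical values chosen for illustration and are not field data; the function name is likewise an assumption.

```python
from statistics import mean


def cell_ratio(cell_voltages):
    """Return the lowest cell voltage divided by the mean cell voltage."""
    if not cell_voltages:
        raise ValueError("at least one cell voltage is required")
    avg = mean(cell_voltages)
    if avg <= 0:
        raise ValueError("mean cell voltage must be positive")
    return min(cell_voltages) / avg


# Hypothetical readings (volts): a uniform stack and one with a weak cell.
uniform_stack = [0.70, 0.70, 0.70, 0.70]
weak_cell_stack = [0.70, 0.70, 0.70, 0.35]

print(cell_ratio(uniform_stack))
print(cell_ratio(weak_cell_stack))
```

A ratio near 1 means all cells perform alike; a falling ratio signals that at least one cell is dropping away from the rest.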

3.  Results  and  discussion 

3.1.  Overall  system  reliability 

The evolving reliability of the GenSys™ fleet is shown in
Fig. 2, which plots the cumulative average number of failures
per system as a function of development time for units that
have been in the field for 3 months (▲) and 12 months (■).
The data in this figure have been normalized by taking the
cumulative average number of failures per system for the
B1 units after 12 months as 1.0. Cumulative average failures
per system were determined by fitting a Crow power law
reliability growth model [4] to the raw data on failures for the
systems considered. The units within each product revision
had acquired an average of 5900-7900 h of field run time
before being fit to the Crow power law model. The B2 unit
reliability is not shown in this figure because the B2 units
were released the same month as the B1 units and had the
same reliability.
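
As a rough illustration of a Crow power law fit, the sketch below uses the standard time-truncated maximum likelihood estimates for the model E[N(t)] = lambda * t^beta. It treats a single repairable system only; how failures were pooled across the fleet is not described here, so that step is not shown. The failure times and the observation window are hypothetical.

```python
import math


def fit_crow_power_law(failure_times, end_time):
    """Fit E[N(t)] = lam * t**beta to one system observed on (0, end_time].

    Uses the time-truncated maximum likelihood estimates:
        beta = n / sum(ln(end_time / t_i)),  lam = n / end_time**beta
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    if any(t <= 0 or t > end_time for t in failure_times):
        raise ValueError("failure times must lie in (0, end_time]")
    log_sum = sum(math.log(end_time / t) for t in failure_times)
    if log_sum == 0:
        raise ValueError("all failures at end_time; beta is undefined")
    beta = n / log_sum
    lam = n / end_time ** beta
    return lam, beta


def expected_failures(lam, beta, t):
    """Expected cumulative number of failures by run time t."""
    return lam * t ** beta


# Hypothetical cumulative run hours at each failure, observed for 8000 h.
times_h = [100.0, 400.0, 900.0, 2500.0, 6000.0]
lam, beta = fit_crow_power_law(times_h, end_time=8000.0)
print(beta, expected_failures(lam, beta, 2190.0))  # about 3 months of run time
```

A fitted beta below 1 means failures arrive more slowly as run time accumulates, which is the signature of reliability growth.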

The reliability data from only about 40% of the units listed
in Table 1 were included in Fig. 2 for a variety of reasons.
For example, most units were installed at distant locations
and required remote communication capability to acquire
reliability data. Some of these units experienced data corruption
or transmission errors. In other cases, customers simply
declined to provide the necessary reliability data.

Fig. 2. Cumulative failures of the GenSys™ fuel cell system fleet. The build
version from Table 1 is shown next to each point for clarity.

Two observations are immediately apparent from Fig. 2.
First, units that have been in the field for 12 months have
more cumulative failures than units that have been in the
field for only 3 months, as expected. Second, after 18 months
of continuous development, the B6 systems are significantly
more reliable than the B1 systems. Early life failures (failures
within the first 3 months of field exposure) were reduced by
77% from the B1 to B6 versions. After 12 months of field
use, the projected failures from the B6 units were 54% lower
than those from the B1 units.

The GenSys™ fuel cell system is a repairable system.
In other words, when a component fails, a repair is made
and the system is restored to operation. For repairable
systems, the time between successive failures is particularly
interesting because the reliability can be modeled using a
nonhomogeneous Poisson process [5]. The rate of improvement
in system reliability, referred to as a “learning curve”,
can be modeled by this process, and design decisions can
be made which affect the overall system reliability [6].
By monitoring learning curves over several development
programs, the similarities across programs can be used to
guide program plans and evaluate development efforts. For
these reasons, we believe Fig. 2 conveys a significant amount
of valuable technical information.

The overall system reliability improvements shown in
Fig. 2 were achieved through a combination of hardware and
software changes to the original B1 product. These changes
not only improved reliability, but also decreased system cost
by ~50% and added two new product features. The two
new product features were: grid standby capability, which
allows the unit to continue to power critical loads, even if
the local electrical grid goes down; and LPG fuel capability.
Major hardware changes implemented in this time period
include: a new inverter design; changes to the stack coolant
circulation scheme; changes to the PEM stack membrane
electrode assembly (MEA); and the elimination of selected
sensors.

Fig. 3. Most frequent component failures in the installed fleet during October
2002.

3.2.  Component  reliability 

One aspect of the overall system reliability improvement
shown in Fig. 2 is a series of component-level changes designed
to eliminate failures. Tracking and understanding the
failure modes and failure frequencies of system components
are important elements in improving overall system reliability.
Figs. 3 and 4 are “snapshots” in time from October 2002
and from June 2004, respectively, illustrating the seven most
frequent component failures in the GenSys™ fleet during
each of the two months indicated. The data from October
2002 come from a sample of 75 field units, while the June
2004 data come from a different sample of 45 field units.
Since the sample sizes are not identical (and the samples
contain different build versions), the component failure data
in each figure have been normalized by the number of failures
of the component that failed most frequently during the
month indicated. The run time for the units in each sample
ranged from about 4000 to 12,000 h. Stack failures exceeded
the failures of any other individual component, and
have been excluded from the component failure data shown
in Figs. 3 and 4 to highlight the reliability of other parts of
the system. Stack reliability will be discussed in Section 3.3.

Fig. 4. Most frequent component failures in the installed fleet during June
2004.
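
The normalization used for the Pareto charts in Figs. 3 and 4 can be sketched as follows: rank components by failure count, keep the most frequent ones, and divide each count by the largest. The component names and counts are hypothetical placeholders, not the categories or data from the fleet.

```python
def normalize_pareto(failure_counts, top_n=7):
    """Return the top_n components as (name, count / largest count) pairs."""
    ranked = sorted(failure_counts.items(), key=lambda item: item[1], reverse=True)
    ranked = ranked[:top_n]
    if not ranked or ranked[0][1] <= 0:
        raise ValueError("need at least one component with a positive count")
    peak = ranked[0][1]
    return [(name, count / peak) for name, count in ranked]


# Hypothetical one-month failure counts by component category.
monthly_counts = {
    "component A": 12,
    "component B": 9,
    "component C": 6,
    "component D": 3,
}

for name, relative in normalize_pareto(monthly_counts):
    print(f"{name}: {relative:.2f}")
```

Because each chart is scaled to its own most frequent failure, the two figures show relative rankings within a month, not absolute failure rates that can be compared between months.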

Component reliability changed significantly from the B1
to  B6  systems.  A  comparison  of  Figs.  3  and  4  shows  that 
six  of  the  seven  most  frequent  component  failures  in  October 
2002  no  longer  made  the  top  seven  list  in  June  2004.  As 
the  most  frequent  component  failures  were  retired  through 
hardware  and/or  software  changes,  other  problems  percolated 
to  the  top  of  the  list.  Enterprise-wide  software  tools  were 
used  to  systematically  report  failures  and  track  problems  to 
resolution.  We  believe  that  tools  of  this  type  are  essential  to 
achieving  the  high  reliability  required  for  stationary  power 
generation  equipment. 

When comparing Figs. 3 and 4, note that some components
may experience more than one failure mode, and
design changes that reduce or eliminate one failure mode
may not eliminate all failure modes caused or experienced by
that component (and may actually create new, unanticipated
failure modes). The failures shown in Figs. 3 and 4 are failure
categories which group together all failures of a particular
component, regardless of the failure mode. In other words, the
reliability of a component can be improved without necessarily
eliminating all failures associated with that component,
or improving the overall reliability of the system. Aging
of the fleet and component wear can, over time, cause new
problems to appear with higher failure rates than previous
problems.

3.3.  PEM  stack  reliability 

No discussion of PEM fuel cell system reliability is complete
without mentioning the reliability of the fuel cell stack.
We have found that a Weibull distribution [7] provides a
reasonably good fit to failure time data of stacks deployed in the
field.

Fig. 5 shows Weibull fits to field stack reliability data
from the B3-B6 systems. System run time in Fig. 5 has
been normalized by the time required for the reliability of
B3 stacks to reach zero. Fig. 5 shows that the time required
for 50% of the stacks to fail (i.e., the median stack life) has
increased by more than a factor of 4 from the B3 to B6
builds.

Fig. 5. Weibull distributions of GenSys™ fuel cell system stack reliability.
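
One common way to fit a two-parameter Weibull distribution and read off the median life is median rank regression, sketched below. This is an illustrative technique, not necessarily the fitting method used for Fig. 5, and it assumes complete data (every stack in the sample has failed); field data with unfailed stacks would need a method that handles censoring. The failure times are hypothetical.

```python
import math


def fit_weibull_mrr(failure_times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Uses Benard's approximation F_i = (i - 0.3) / (n + 0.4) and a least-squares
    line through ln(-ln(1 - F_i)) = beta * ln(t_i) - beta * ln(eta).
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2 or times[0] <= 0:
        raise ValueError("need at least two positive failure times")
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("failure times must not all be identical")
    beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - beta * x_bar
    eta = math.exp(-intercept / beta)
    return beta, eta


def weibull_median(beta, eta):
    """Time at which 50% of units have failed: eta * (ln 2)**(1 / beta)."""
    return eta * math.log(2.0) ** (1.0 / beta)


# Hypothetical stack failure times in normalized run-time units.
sample = [0.21, 0.34, 0.42, 0.55, 0.61, 0.78, 0.93]
beta, eta = fit_weibull_mrr(sample)
print(beta, eta, weibull_median(beta, eta))
```

Comparing `weibull_median` for two builds gives the kind of median-life ratio quoted above.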

The large increase in median stack life was achieved
through a combination of software and hardware changes.
The dramatic improvement from the B3 to B4 builds was
mainly due to a software upgrade and control algorithm
changes. The new software and control algorithms provided
better control of stack coolant inlet temperature by using
a cascaded proportional-integral-derivative (PID) control
scheme. The coolant temperature change across the stack was
put under closed loop control, using the coolant pump speed
as the adjustable parameter. Closed loop control of cathode
humidification was also implemented. These software
changes enabled more precise control of the stack coolant
and reactant inlet temperatures. In addition, the new software
and control algorithms periodically cycled certain movable
components to prevent them from sticking.
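
To make the idea of closing a loop around the coolant temperature rise concrete, the sketch below shows a generic discrete PID controller that adjusts a pump speed command. It is a single loop rather than the cascaded scheme, and the gains, bias, limits, and setpoint are all hypothetical; none of them are taken from the GenSys™ control software.

```python
class PIDController:
    """Discrete PID loop with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, bias, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.bias = bias
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        """Return the new output; it rises when the measurement exceeds the setpoint."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        # Direct-acting sign convention: a temperature rise above the setpoint
        # gives a positive error and therefore more coolant flow.
        error = measurement - setpoint
        integral = self.integral + error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        raw = self.bias + self.kp * error + self.ki * integral + self.kd * derivative
        output = min(self.out_max, max(self.out_min, raw))
        if output == raw:
            # Accumulate the integral only while the output is not saturated.
            self.integral = integral
        self.prev_error = error
        return output


# Hypothetical tuning: pump speed in percent, temperature rise in degrees C.
pump_loop = PIDController(kp=5.0, ki=0.5, kd=0.0, bias=50.0, out_min=20.0, out_max=100.0)
print(pump_loop.update(setpoint=5.0, measurement=7.0, dt=1.0))
```

In a cascaded arrangement, the output of an outer loop would become the setpoint of an inner loop like this one.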

The improvement in stack life in the B5 and B6 builds
was primarily the result of hardware changes. One change
reduced cell-to-cell temperature variations by improving the
coolant distribution between cells within the stack. A hardware
change in the fuel processing module resulted in a reduction
in the reformate CO concentration. Other major changes
included using more reliable components and reducing
unit-to-unit variation.

For each GenSys™ fuel cell system deployed in the field,
more than 200 parameters are periodically recorded. Several
stack parameters are used, in conjunction with other indicators,
as monitors of stack “health”. Fig. 6 shows two of
these indicators, stack voltage (normalized by the maximum
recorded stack voltage) and cell ratio, in the time period between
January and December 2004. The data in Fig. 6 are
from a single fielded system that was commissioned in May
2003 and had been running for over 13,000 h at the time
this paper was written. From a linear regression of the stack
voltage, an average stack degradation rate was found to be
3.5 μV cell⁻¹ h⁻¹.

Fig. 6. Performance of a fuel cell stack in a GenSys™ fuel cell system.
As of December 2004, this system has been running in the field for over
13,000 h.
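
A degradation rate of this kind can be obtained from an ordinary least-squares line through stack voltage versus run time, divided by the number of cells. The sketch below uses synthetic data; the cell count and the voltage values are hypothetical and serve only to show the arithmetic.

```python
def degradation_rate_uv_per_cell_hour(hours, stack_volts, n_cells):
    """Return the average voltage decay in microvolts per cell per hour.

    Fits stack voltage against run time by least squares, then converts the
    slope (V/h for the whole stack) to a positive per-cell decay rate.
    """
    n = len(hours)
    if n < 2 or n != len(stack_volts):
        raise ValueError("need two or more paired samples")
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    h_bar = sum(hours) / n
    v_bar = sum(stack_volts) / n
    sxx = sum((h - h_bar) ** 2 for h in hours)
    if sxx == 0:
        raise ValueError("run-time samples must not all be identical")
    slope = sum((h - h_bar) * (v - v_bar) for h, v in zip(hours, stack_volts)) / sxx
    return -slope / n_cells * 1e6


# Synthetic example: an assumed 88-cell stack losing 3.5 microvolts per cell per hour.
n_cells = 88
run_hours = [0.0, 2000.0, 4000.0, 6000.0, 8000.0]
volts = [60.0 - 3.5e-6 * n_cells * h for h in run_hours]
print(degradation_rate_uv_per_cell_hour(run_hours, volts, n_cells))
```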

Fig.  6  also  shows  that  the  cell  ratio,  which  is  defined  as  the 
ratio  of  the  lowest  cell  voltage  to  the  mean  cell  voltage,  was 
nearly  constant  before  July  2004.  This  observation  indicates 
that  the  performance  of  all  the  cells  was  about  the  same  during 
the  first  9500  operating  hours.  However,  starting  in  July  2004, 
the  cell  ratio  decreased,  indicating  that  the  voltage  of  one  or 
more  cells  decreased  much  faster  than  the  majority  of  other 
cells.  The  loss  of  an  adequate  voltage  in  one  or  more  cells  in 
a  GenSys™  stack  will  cause  a  stack  failure,  even  if  the  stack 
voltage  as  a  whole  is  still  acceptable. 

We find that individual cells within a GenSys™ stack often
exhibit similar patterns while failing. Failing cells often
experience a slow voltage decay over a long time period,
much like other cells. Cells that will eventually fail then
experience a somewhat faster voltage decrease over a shorter
time period, followed by a very rapid loss of voltage during
a short time period (a phenomenon sometimes referred to as
“sudden death”). We note that once the voltage loss within a
cell starts to accelerate, sudden death of that cell is normally
not far away.

The acceleration of voltage loss is consistent with the
formation of small holes in the MEA. We believe that the onset
of accelerating voltage decrease corresponds to the formation
of pinholes in the MEA. Sudden death occurs when the MEA
holes reach a critical size.


4.  Conclusions 

Improving the reliability of complex fuel cell systems requires
problem identification, tracking, and resolution at the
system level. High stack reliability is necessary but not
sufficient to guarantee high system reliability, as the failure of
other components within the fuel cell system can cause a
loss of product functionality. Over an 18 month period, Plug
Power has demonstrated a factor of 2 improvement in the
overall reliability of GenSys™ fuel cell systems while
simultaneously increasing the median stack life by a factor of
4, decreasing product cost by about 50%, and adding new
features through a combination of software and hardware
changes. The rate at which system reliability was improved in
this fleet of fuel cell systems can be used to develop program
plans and schedules. The Weibull distribution was found to
provide a reasonably good fit to failure time data of stacks
deployed in the field. Data presented here on component
reliability can be used to prioritize future research and
development needs.